Mining Multiple Private Databases using a Privacy Preserving kNN Classifier
Abstract
Data mining technologies are popular for identifying interesting patterns and trends in large amounts of data. With the advent of high-speed networks and easily available storage, many organizations are able to collect large amounts of data. On one hand, these organizations would like to mine their data to understand and discover interesting patterns; on the other hand, many legal and commercial reasons may prevent the organizations from sharing their data. In such situations, privacy preserving data mining tools become critical: they allow data from many organizations to be mined securely and with a minimum of information disclosure. In this paper, we present a framework for mining multiple private databases using a privacy preserving k-Nearest Neighbor (kNN) classifier. We develop a general model for privacy preserving kNN classification and present algorithms for realizing this model. We specify requirements that all privacy preserving classifiers should strive to achieve and analyze how well our algorithm achieves these requirements. This is the first paper to show how kNN classification can be achieved in a privacy preserving manner. A novel feature of our algorithm is that it offers a trade-off between accuracy, efficiency, and privacy. Thus, it can be applied in a variety of problem settings and can meet different optimization criteria.
Similar resources
Privacy-Preserving Decision Tree Classification Over Horizontally Partitioned Data
Protection of privacy is one of the important problems in data mining. Unwillingness to share data frequently results in the failure of collaborative data mining. This paper studies how to build a decision tree classifier under the following scenario: a database is horizontally partitioned into multiple pieces, with each piece owned by a particular party. All the parties want to build a decis...
Privacy Preserving Two-Layer Decision Tree Classifier for Multiparty Databases
Privacy protection is one of the important problems in data mining. The growth of the Internet has triggered incredible opportunities for cooperative computation, where people jointly conduct computation tasks based on the private inputs they each supply. These computations could occur between mutually untrusted parties or even between competitors. Today, to conduct such computations,...
Privacy Preserving Aggregation of Secret Classifiers
In this paper, we address the issue of privacy preserving data mining. Specifically, we consider a scenario where each member j of T parties has its own private database. The party j builds a private classifier hj for predicting a binary class variable y. The aim of this paper is to aggregate these classifiers hj in order to improve individual predictions. More precisely, the parties wi...
A review on Security in Distributed Information Sharing
In recent years, privacy preserving data mining has emerged as a very active research area in data mining. Over the last few years this has naturally led to a growing interest in security or privacy issues in data mining. More precisely, it became clear that discovering knowledge through a combination of different databases raises important security issues. Privacy preserving data mining is on...
Publication date: 2006